ABSTRACT

This invention provides a method of cleaving target
proteins from Caulobacter S-layer protein under mild-acid
conditions, whereby a fusion protein secreted by the
Caulobacter and comprising the target protein and at least
a Caulobacter S-layer secretion signal may be cleaved at an
aspartate-proline dipeptide without solubilizing the fusion
protein. This method may be carried out while the fusion
protein is in an insoluble aggregate, which facilitates
recovery of the protein. This invention also provides a
method of preparing a DNA construct for expression of the
fusion protein and a method of preparing the fusion
protein.
CLEAVAGE OF CAULOBACTER PRODUCED
RECOMBINANT FUSION PROTEINS 



FIELD OF INVENTION 

This invention relates to the expression and secretion of
recombinant fusion proteins from Caulobacter wherein a
heterologous polypeptide is fused with all or part of the
surface layer protein (S-layer protein) of the bacterium.

BACKGROUND OF THE INVENTION 

Many bacteria assemble layers composed of repetitive, 
regularly aligned, proteinaceous sub-units on the outer 
surface of the cell. These layers are essentially 
two-dimensional paracrystalline arrays, and being the outer 
molecular layer of the organism, directly interface with 
the environment. In Caulobacter, the S-layer protein is
synthesized by the cell in large quantities and the S-layer
completely envelops the cell and thus appears to be a
protective layer.



Caulobacter are natural inhabitants of most soil and
freshwater environments and may persist in waste water
treatment systems and effluents. The bacteria alternate
between a stalked cell that is attached to a surface, and
an adhesive motile dispersal cell that searches to find a
new surface upon which to stick and convert to a stalked
cell. The bacteria attach tenaciously to nearly all
surfaces and do so without producing the extracellular
enzymes or polysaccharide "slimes" that are characteristic
of most other surface attached bacteria. Caulobacters
have simple requirements for growth. The organism is
ubiquitous in the environment and has been isolated from
oligotrophic to mesotrophic situations. They are known for
their ability to tolerate low nutrient level stresses, for
example, low phosphate levels.
All of the freshwater Caulobacter that produce an S-layer
are similar and have S-layers that are substantially the
same under electron microscopy. The layers are hexagonally
arranged in all cases, with a similar centre-centre
dimension (see: Walker, S.G., et al. (1992) "Isolation
and Comparison of the Paracrystalline Surface Layer
Proteins of Freshwater Caulobacters" J. Bacteriol. 174:
1783-1792).

16S rRNA sequence analysis of several S-layer producing
Caulobacter strains shows that they group closely (see:
Stahl, D.A., et al. (1992) "The Phylogeny of Marine and
Freshwater Caulobacters Reflects Their Habitat" J.
Bacteriol. 174: 2193-2198). DNA probing of Southern blots
using the S-layer gene from C. crescentus CB15 identifies
a single band that is consistent with the presence of a
cognate gene (see: MacRae, J.D. and J. Smit (1991)
"Characterization of Caulobacters Isolated from Wastewater
Treatment Systems" Applied and Environmental Microbiology
57:751-758). Furthermore, antisera raised against the
S-layer protein of CB15 react against the S-layer protein
of other Caulobacter (see: Walker, S.G. et al. (1992)
[supra]). All S-layer proteins isolated from Caulobacter
may be substantially purified using the same methods. All
strains appear to have a polysaccharide species which may
be required for S-layer attachment (see: Walker, S.G. et al.
(1992) [supra]).

The S-layers elaborated by freshwater isolates of
Caulobacter are visibly indistinguishable from the S-layer
produced by Caulobacter strains CB2 and CB15. The S-layer
proteins from the latter strains have a molecular weight of
approximately 100,000, although sizes of S-layer proteins
from other species and strains will vary. The hydrophilic
S-layer protein has been characterized both structurally
and chemically. It is composed of ring-like structures
spaced at 22 nm intervals arranged in a hexagonal manner on
the outer membrane. The S-layer is bound to the bacterial
surface and may be removed by low pH treatment or by
treatment with a calcium chelator such as EDTA.

The similarity of S-layer proteins in different strains of
Caulobacter permits the use of a cloned S-layer protein
gene of one Caulobacter strain for retrieval of the
corresponding gene in other Caulobacter strains (see:
Walker, S.G. et al. (1992) [supra]; and MacRae, J.D. et al.
(1991) [supra]).

Expression of a heterologous polypeptide as a fusion
product with the S-layer protein of Caulobacter provides
advantages not previously seen in systems for production of
recombinant fusion proteins using other organisms such as
E. coli and Salmonella. All known Caulobacter strains are
believed to be harmless and are nearly ubiquitous in
aquatic environments. In contrast, many Salmonella and
E. coli strains are pathogens. Consequently, expression and
secretion of a heterologous polypeptide using Caulobacter
as a vehicle has the advantage that the expression system
will be stable in a variety of outdoor environments and may
not present problems associated with the use of a
pathogenic organism. Furthermore, Caulobacter are natural
biofilm forming species and may be adapted for use in fixed
biofilm bioreactors. The quantity of S-layer protein that
is synthesized and secreted by Caulobacter is high,
reaching 12% of the cell protein.






There is an existing need to produce pure proteins and
peptides in an economical manner and in a manner that
minimizes or simplifies the purification steps needed after
fermentation. Key commercial areas include the production
of recombinant human and animal therapeutic antibiotic and
vaccine peptides, industrial enzymes, protein polymers, and
antibacterial enzymes for foodstuffs. Many of these
commercial applications require low production costs and
there are few expression systems available that can meet
such cost constraints. In addition, there are numerous
research applications where rapid methods to produce and
purify proteins are needed to facilitate the discovery
stage. This is especially true where there is a desire to
express a large number of proteins with unknown function
(from a collection of cloned cDNAs, for example) or a
large number of variants of a single protein (for example,
resulting from site directed mutagenesis) in a search for
variants with improved properties.

Generally, proteins must be secreted to be produced at low 
cost. The primary reason is the much reduced cost of 
purification of the target protein from cell material. 
However, even for secreted proteins, simple methods of 
separating the product from spent culture and cells are 
important for cost reduction and ease of use. 

International patent application published as WO 97/34000
on September 18, 1997 describes the expression and
secretion of recombinant proteins from Caulobacter in which
the recombinant protein is a fusion of all or part of
Caulobacter S-layer protein with a heterologous protein of
interest (also see: Bingle, W.H., et al. (1997) "Linker
Mutagenesis of the Caulobacter crescentus S-layer Protein:
Toward a Definition of an N-terminal Anchoring Region and a
C-terminal Secretion Signal and the Potential for
Heterologous Protein Secretion" J. Bacteriol.
179:601-611).

The Caulobacter S-layer secretion apparatus is a "Type 1"
secretion system of the kind usually found in pathogenic
bacteria and noted for its ability to secrete a wide
variety of proteins including large and hydrophilic
proteins. The Caulobacter protein secretion system is
particularly useful to secrete recombinant proteins.
The Caulobacter S-layer Type 1 secretion pathway requires
only a C-terminal secretion signal, typically comprising
about 200 amino acids at the end of the protein. The
export mechanism is capable of tolerating a wide variety of
foreign proteins. Recombinant proteins may be conveniently
produced as fusion proteins with the target protein being
fused to the C-terminal secretion signal. Depending on the
application, it may be desirable to remove the secretion
signal following secretion. Not removing the secretion
signal may be an approach suitable for many subunit vaccine
applications, where the remaining S-layer protein serves as
a carrier.

A unique and desirable feature of fusion proteins produced
by the Caulobacter S-layer protein secretion system is that
they form insoluble aggregates in the culture medium. This
is apparently a consequence of the S-layer sequences
associated with the secretion signal and reflects the fact
that the protein normally self-assembles into a
two-dimensional crystalline layer on the bacterium's
surface. These aggregates are visible to the naked eye and
are readily collected by simple filtration. With simple
water wash steps, residual bacterial cells are readily
flushed away. It is routinely possible to achieve a protein
purity of 90% or better with this simple purification
procedure.

DESCRIPTION OF THE PRIOR ART 

Most current protein purification systems for recombinant 
proteins produced by bacteria rely upon an affinity matrix 
to achieve separation of the target protein and to 
concentrate the protein for subsequent steps of 
purification. To accomplish this, genes for recombinant 
proteins are commonly constructed so that they contain 
affinity tags, which are protein sequences that will bind 
to an affinity matrix. Some commonly used systems are: 
1) Glutathione S-transferase (GST) tag, which binds to
glutathione-sepharose matrices.

2) Maltose binding protein (MBP) tag, which binds to
amylose matrices.

3) Multiple tandem histidine residues (e.g. "His-6") tag,
which binds to nickel-derivatized solid matrices.

4) Protein A tag, which binds to immunoglobulin
IgG-derivatized sepharose or comparable matrices.

Prior art techniques were typically developed so that 
removal of a target protein does not disrupt the tag and 
matrix association. Instead, enzymes that cleave specific 
sequences of amino acids are employed. The enzyme cleavage 
sequence is positioned between the tag and the desired 
recombinant protein and enzymatic cleavage is effected 
directly on the matrix with attached fusion protein. If a 
secretion signal is used, the cleavage site is usually 
positioned such that the secretion signal is separated from 
the target recombinant protein during the cleavage step. 
The matrix is regenerated for re-use only after the target 
recombinant protein has been purified away from the matrix. 
Typical enzymes used in these methods are Factor Xa, 
enterokinase and collagenase. 

Chemical cleavage is generally not used because the 
conditions required for cleavage will disrupt the binding 
of affinity tag and matrix or destroy the matrix. When 
chemical cleavage is used with recombinant fusion proteins 
to cleave target protein from a secretion signal and/or 
affinity tag, solubilization and denaturation processes are 
generally employed. The expectation is that complete or 
nearly complete unfolding of the protein is a prerequisite 
for effective cleavage. 
Mild-acid cleavage is predicated on the inclusion, by
happenstance or design, of the acid-sensitive
aspartate-proline dipeptide at a desired site for cleavage.
The recombinant fusion protein is exposed to conditions
that solubilize and/or completely denature the protein
prior to cleavage. The chaotropic agent guanidine
hydrochloride (used at 6-7 M) is commonly employed to
denature and solubilize the protein prior to, or at the
same time as, acid treatment. Alternatively, high
concentrations of acids that also serve as solubilizing
agents (for example, 70-90% formic acid, acetic acid [10%]
pyridine, or relatively high concentrations of HCl [60 mM
or more]) are employed. Because such conditions would
disrupt a tag/affinity matrix association, direct cleavage
of an affinity tag from the target protein while a protein
remains associated with an affinity matrix is not
attempted.

SUMMARY OF INVENTION 

This invention is based on the unexpected discovery that
recombinant fusion proteins produced by the Caulobacter
S-layer protein secretion system can be cleaved under
mild-acid conditions without solubilization of the fusion
protein being required. This cleavage may be accomplished
when the fusion protein is present as the insoluble
aggregate typically formed by Caulobacter S-layer protein.
Cleavage occurs at aspartate-proline dipeptides which may
be in the heterologous protein portion of the fusion
protein or native to the Caulobacter S-layer portion. The
dipeptide may also be placed at a desired location for
cleavage by engineering DNA encoding the fusion protein to
express the dipeptide at the desired location. Typically,
the desired location for cleavage will be at or near the
junction of the heterologous (target) protein and the
Caulobacter S-layer portion comprising the Caulobacter
secretion signal such that a cleavage product will be the
target protein in its entirety and preferably free of
extraneous amino acids.

The current invention makes it possible to cleave a
heterologous (target) protein from the S-layer protein
portion using only mild-acid conditions, even while the
fusion protein is in an aggregated form. These cleavage
conditions do not result in significant solubilization of
the S-layer protein portion. In essence, the S-layer
aggregation phenomenon functions like an affinity matrix.

This invention provides a method of cleaving a fusion
protein consisting of a first component comprising all or
part of a Caulobacter S-layer protein including a
Caulobacter C-terminal secretion signal, and a second
component comprising a heterologous polypeptide expressed
and secreted from Caulobacter wherein the fusion protein
comprises at least one aspartate-proline dipeptide, and
wherein the method comprises the step of combining said
fusion protein with an acid solution of a strength
insufficient to solubilize the fusion protein for a time
sufficient for cleavage of the fusion protein at said
aspartate-proline. The acid solution may have a pH of from
about 1.5 to about 2.5, preferably about 1.65 to 2.35.
Preferred pH conditions may be achieved using an acid
equivalent in the range of about 5 to about 20 mM HCl. The
method is typically carried out at a temperature in the
range of approximately room temperature to about 50° C.

This invention also provides a method of preparing a DNA
construct suitable for expression of a fusion protein
suitable for use in the method of this invention,
comprising joining an upstream DNA segment comprising DNA
for a heterologous protein of interest and a downstream DNA
segment for a Caulobacter C-terminal secretion signal which
does not encode an aspartate-proline dipeptide, wherein the
upstream segment comprises a sequence encoding an
aspartate-proline dipeptide at or near a junction between
said upstream and downstream segments.

This invention also provides a method of preparing a fusion 
protein, comprising the steps of expressing a DNA construct 
as described above in Caulobacter and recovering said 
fusion protein once secreted by the Caulobacter. 

Once cleavage is accomplished according to this invention, 
the S-layer portion comprising the Caulobacter secretion
signal may remain as an insoluble aggregate. If the target 
protein is soluble, the S-layer portion may be easily 
separated from the target recombinant protein by simple 
centrifugation or filtration methods. Thus the system 
functions in a manner analogous to a Tag/affinity matrix 
system except that here, the affinity "tag" is the means of 
producing the insoluble matrix. In addition, this "matrix" 
is resistant to the effects of the acid treatment, allowing 
direct cleavage of the target recombinant protein. In this 
way, a very inexpensive chemical cleavage method can be 
employed to economically retrieve recombinant proteins from 
a bacterial fusion protein. In contrast to the cost of 
most affinity matrices, there is little expense associated 
with the use of the S-layer secretion signal as the matrix 
as it is simply a part of the fermentation/secretion
process.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION 

Production of Recombinant Fusion Proteins Using the 
Caulobacter S-layer Secretion System 

Proteins may be produced using the Caulobacter S-layer Type
1 secretion pathway, which requires only the C-terminal
secretion signal of the Caulobacter. This signal is the
C-terminal portion of the S-layer protein, which typically
comprises about 200 amino acids (see: Bingle, et al. (1997)
[supra]; and WO 97/34000). Additional Caulobacter S-layer
gene sequence upstream from the secretion signal may also
be present and is desirable to contribute to aggregate
formation of the secreted protein. The additional
Caulobacter sequence may constitute most or all of the
remainder of the S-layer protein. Typically, the aggregate
forms as loose, gel-like clumps of pure protein that can be
readily retrieved and separated from the bacteria by simple
filtration.


Standard techniques (such as methods described in WO 
97/34000) may be used to identify the amount of the 
C-terminal portion of a particular Caulobacter S-layer 
protein which functions as the secretion signal. 


The creation of fusion proteins is commonly done by
preparing the target gene DNA and fusing it in-frame with
the C-terminal region of the S-layer gene. There are
numerous possible methods; some examples are:

1. Oligonucleotide Chemical Synthesis. This involves the
design of complementary single strands, complete with
desirable restriction endonuclease cut sites at the ends,
chemical synthesis of the strands followed by annealing,
and cloning into a plasmid vector, juxtaposed to an
appropriate portion of the C-terminal region of the S-layer
gene.

2. Production of the Target Gene DNA by Polymerase Chain
Reaction (PCR) Amplification of a Target Sequence. In this
case, appropriate in-frame restriction sites are
incorporated into the short oligonucleotides used for
amplification of a target sequence, such that the final PCR
product can be treated with the appropriate restriction
enzymes (to create the restriction site "sticky ends"),
followed by cloning into a plasmid vector, juxtaposed to an
appropriate portion of the C-terminal region of the S-layer
gene.
3. Adapting Restriction Endonuclease Cleavage Sites that
are Native to a Target Protein Gene Sequence for Fusion to
the DNA Coding for the C-terminal S-layer Secretion Signal
to Accomplish In-frame Expression of a Chimeric Protein.
This can be accomplished by direct ligation (although it is
uncommon that an appropriate match will occur), or the use
of adapter sequences or methods involving blunting of a
restriction site and subsequent blunt-end ligation to
change expression reading frame or join unlike restriction
site sticky ends.
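
The in-frame requirement common to the three approaches above can be checked on paper or in a few lines of code. The following Python sketch is illustrative only and is not part of the disclosed method: the function names `translate` and `fuse_in_frame` and the toy DNA sequences are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(target_dna: str, signal_dna: str) -> str:
    """Join target and secretion-signal DNA and confirm the frame is kept."""
    if len(target_dna) % 3 != 0:
        raise ValueError("target length is not a multiple of 3; "
                         "the secretion signal would be read out of frame")
    protein = translate(target_dna + signal_dna)
    if "*" in protein[:-1]:
        raise ValueError("stop codon found before the end of the fusion")
    return protein


# Hypothetical toy segments (not from the patent).
print(fuse_in_frame("ATGAAAACC", "GGCTCGGCCTAA"))
```

A target segment that fails the length check would need an adapter or a blunting step of the kind described in item 3 before ligation.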

There will be numerous convenient sites for fusion with the
C-terminal regions of the S-layer that lead to the
successful expression, secretion and aggregation of a
recombinant fusion protein. Some example positions are at
or near the DNA sites corresponding to amino acids 622,
690, 784, 892 and 907 of the C. crescentus S-layer gene
(see: Appendix 1 and WO 97/34000). Other sites of fusion
with the S-layer gene may also be employed. Most often a
plasmid vector is designed such that the C-terminal gene
segment is resident on a plasmid with appropriate
restriction sites placed at the N-terminal junction of the
S-layer fragment. Target recombinant protein gene segments
are then cloned into those restriction sites. It is
typical to prepare initial plasmid constructs that are
replicated in E. coli. After a construct is produced, it is
typically transferred to a broad host range plasmid which
can then be introduced into the appropriate Caulobacter
strain by electroporation. Suitable broad host range
plasmids can be constructed from (but are not limited to)
the IncQ, IncW and IncP1 plasmid incompatibility groups.

The introduction of the aspartate-proline (Asp-Pro)
dipeptide at the appropriate site in the fusion protein can
be done in several ways. Some examples are:

1. Incorporating a DNA sequence necessary to express the
Asp-Pro dipeptide into the oligonucleotides used to prepare
the target sequence, either by oligonucleotide synthesis or
PCR methods.

2. Preparing a DNA segment with appropriate restriction
sites at the termini so that an Asp-Pro dipeptide can be
introduced (most often at the junction between S-layer and
target gene) after a fusion recombinant S-layer gene has
been made.

3. Use of a native Asp-Pro dipeptide in either the target
DNA or the S-layer segment. For example, an Asp-Pro
dipeptide is located at amino acids 692 and 693 of the
C. crescentus S-layer gene (see: Appendix 1 and WO 97/34000)
and is suitable for fusions made at that amino acid site.
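
As an illustration of the first approach, the Python sketch below builds a reverse PCR primer whose tail adds Asp-Pro codons and a cloning site to the 3' end of a target sequence. It is a sketch under stated assumptions, not a procedure taken from this specification: the function names, the 12-nucleotide annealing region, the GAC CCG codon choice and the BamHI site (GGATCC) are all hypothetical choices made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def reverse_primer_with_asp_pro(target_3prime: str, site: str = "GGATCC") -> str:
    """Reverse primer adding Asp-Pro codons and a cloning site.

    target_3prime is assumed to be whole codons from the end of the
    target coding sequence, with the stop codon already removed.
    """
    # Top (coding) strand after PCR: target end, Asp (GAC), Pro (CCG), site.
    top_strand = target_3prime.upper() + "GACCCG" + site
    # The reverse primer anneals to the top strand, so it is its reverse complement.
    return reverse_complement(top_strand)


# Hypothetical 3' end of a target gene (not from the patent).
primer = reverse_primer_with_asp_pro("GCGAAACAGCGC")
print(primer)
```

Because the added tail is 12 nucleotides long, the reading frame of a downstream S-layer segment joined at the site is unchanged in this toy design.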

The methods described above are not the only methods of 
creating and expressing the fusion recombinant S-layer 
proteins, nor is it essential to have the engineered genes 
resident on a plasmid. For example, the expressed gene may 
be introduced into the chromosome (using well-known gene 
insertion or replacement techniques) and still achieve 
secretion of the recombinant proteins (see WO 97/34000).
In some cases it may be desirable to produce recombinant
fusion proteins as insertions of heterologous DNA in the
middle of the S-layer gene. In such a case, Asp-Pro
dipeptide sequences would be engineered at the N and
C-termini of the target peptide.

All possible codon combinations for Asp-Pro will work but
the CCA codon for proline is not preferred due to a likely
low amount of the corresponding tRNA in Caulobacter. The
following is an approximate usage table for C. crescentus.
Caulobacter crescentus Codon Usage Table
[Amino Acid] [Triplet Code] [Frequency Per Thousand]

Phe  UUU   2.5    Ser  UCU   1.2    Tyr  UAU   6.6    Cys  UGU   0.6
Phe  UUC  27.0    Ser  UCC   8.5    Tyr  UAC   9.6    Cys  UGC   5.5
Leu  UUA   0.0    Ser  UCA   1.2    STOP UAA   0.8    STOP UGA   1.6
Leu  UUG   4.4    Ser  UCG  25.7    STOP UAG   0.6    Trp  UGG   7.2

Leu  CUU   4.4    Pro  CCU   2.5    His  CAU   3.2    Arg  CGU   7.6
Leu  CUC  15.7    Pro  CCC  15.5    His  CAC  12.2    Arg  CGC  44.7
Leu  CUA   1.1    Pro  CCA   0.9    Gln  CAA   3.7    Arg  CGA   3.0
Leu  CUG  72.3    Pro  CCG  27.1    Gln  CAG  30.2    Arg  CGG  12.1

Ile  AUU    --    Thr  ACU   1.2    Asn  AAU   4.1    Ser  AGU   0.8
Ile  AUC  49.0    Thr  ACC  37.3    Asn  AAC  23.8    Ser  AGC  14.9
Ile  AUA   0.3    Thr  ACA   0.8    Lys  AAA   2.7    Arg  AGA   0.4
Met  AUG  25.7    Thr  ACG  16.8    Lys  AAG  37.9    Arg  AGG   1.1

Val  GUU   5.4    Ala  GCU   9.5    Asp  GAU  11.1    Gly  GGU   9.5
Val  GUC  42.7    Ala  GCC  84.1    Asp  GAC  48.5    Gly  GGC  64.8
Val  GUA   1.0    Ala  GCA   2.2    Glu  GAA  20.5    Gly  GGA   2.3
Val  GUG  30.7    Ala  GCG  36.7    Glu  GAG  45.4    Gly  GGG   7.7
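
The codon preference stated before the table can be illustrated by ranking the eight possible Asp-Pro codon pairs. The Python sketch below is an assumption-laden illustration rather than a method taken from this specification: the scoring rule (product of the two per-thousand frequencies) and the function name `ranked_asp_pro_codons` are hypothetical, and the four proline values are assumed to follow the usual CCU, CCC, CCA, CCG order.

```python
from itertools import product

# Frequencies per thousand, copied from the usage table above.
ASP = {"GAU": 11.1, "GAC": 48.5}
PRO = {"CCU": 2.5, "CCC": 15.5, "CCA": 0.9, "CCG": 27.1}


def ranked_asp_pro_codons():
    """Return (Asp codon, Pro codon, score) tuples, highest score first.

    The score is simply the product of the two usage frequencies, a
    rough heuristic chosen for illustration only.
    """
    pairs = [(d, p, ASP[d] * PRO[p]) for d, p in product(ASP, PRO)]
    return sorted(pairs, key=lambda item: item[2], reverse=True)


for asp, pro, score in ranked_asp_pro_codons():
    print(asp, pro, round(score, 1))
```

Under this heuristic the GAC CCG pair ranks first and any pair using CCA ranks near the bottom, consistent with the remark that CCA is not preferred.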






Large quantities (e.g. 12% of total cell protein/3% of input
organic carbon) of a wide range of proteins can be
produced, with yields in the order of 250 mg/liter of batch
culture. Fusion proteins with 35 kDa of target peptide are
secreted with little difficulty, although proteins with
multiple cysteines may be more difficult to express.
Post-expression glycosylation of proteins does not occur,
an advantage for most peptide expression applications.



Host Expression Strains

For secretion of recombinant fusion S-layer proteins, the
Caulobacter strain will preferably have lost the ability to
produce a native S-layer protein, while retaining a fully
functional S-layer protein secretion apparatus. This may
be done by screening for mutants that have spontaneously
become S-layer protein negative; or, by directed genetic
manipulation, such as (but not limited to) the insertion of
a drug resistance cassette in the middle of the S-layer
gene or the substitution of a version of the S-layer gene
which has had a sizable internal region deleted from the
gene (see: Bingle et al. (1997) [supra]; Bingle et al. (1997)
"Cell Surface Display of a Pseudomonas aeruginosa PAK
Pilin Peptide with the Paracrystalline Layer of Caulobacter
crescentus" Molec. Microbiol. 26:277-288; and, Edwards and
Smit (1991) "A Transducing Bacteriophage for Caulobacter
crescentus Uses the Paracrystalline Surface Layer Protein as
a Receptor" J. Bacteriol. 173: 5568-5572). In the case of a
genetic manipulation, a common method for producing such
strains is to do appropriate modification of a copy of the
S-layer gene while on a plasmid and then to use well known
gene replacement methods to substitute the modified gene
for the native gene in the Caulobacter chromosome (see:
Edwards and Smit (1991) [supra]).

In the rare case that an entire S-layer gene is used for
production of a recombinant protein (via insertion of a
target sequence), strains defective in the production of
the lipopolysaccharide (LPS) used for S-layer attachment to
the bacterial surface can be used. These can be prepared
by forcing Caulobacter to grow without exogenous calcium.
Under these conditions mutants arise that are uniformly
defective in producing a proficient version of the S-layer
LPS (see: Walker, S.G. et al. (1994) "Characteristics of
Mutants of Caulobacter crescentus Defective in Surface
Attachment of the Paracrystalline Layer" J. Bacteriol. 176:
6312-6323).

All Caulobacter S-layer producing strains are suitable for
this technology. One may either isolate the S-layer gene
from a strain (using homology between Caulobacter S-layers
to design probes to detect and clone the S-layer genes) and
adapt the C-terminal region for recombinant protein
expression in a manner similar to that done for
C. crescentus strains (see: MacRae and Smit (1991) [supra],
and Walker, S.G. et al. (1992) [supra]). Alternatively, one
may use recombinant fusion S-layer genes produced with the
C. crescentus S-layer gene and express them in alternate
Caulobacter hosts.

Freshwater Caulobacter producing S-layers may be readily
detected by negative stain transmission electron microscopy
techniques. Caulobacter may be isolated using the methods
outlined by MacRae and Smit (1991) [supra], which take
advantage of the fact that Caulobacter can tolerate periods
of starvation while other soil and water bacteria may not
and that they all produce a distinctive stalk structure,
visible by light microscopy (using either phase contrast or
standard dye staining methods). Once Caulobacter strains
are isolated, in a typical procedure colonies are suspended
in 2% ammonium molybdate negative stain and applied to
plastic-filmed, carbon-stabilized 300 or 400 mesh copper or
nickel grids and examined in a transmission electron
microscope at 60 kilovolt accelerating voltage (see: Smit,
J. (1986) "Protein Surface Layers of Bacteria", in Outer
Membranes as Model Systems (M. Inouye, ed.) J. Wiley & Sons,
at p. 343-376). S-layers are seen as two-dimensional
geometric patterns most readily on those cells in a colony
that have lysed and released their internal contents.
Recombinant Protein Purification

Secreted proteins are separated and "shed" into the culture
media as a macroscopic precipitate (the aggregate described
above). The shedding phenomenon is a consequence of the
absence of the N-terminal region of the S-layer protein in
the expressed recombinant protein, or the loss of the
lipopolysaccharide species used for S-layer attachment by
the Caulobacter (see: Walker, S.G. et al. (1994) [supra]).

The loose gel-like clumps of aggregate may be readily
separated from a soluble cleaved target protein by any
suitable technique such as filtration or centrifugation.
If the target protein is insoluble once cleaved, it may
then be convenient to solubilize one or both of the
proteins (for example in 8 M urea or 6 M guanidine HCl) and
separate by chromatography. In this way, only two species
of protein need to be separated.

Cleavage of Fusion Proteins 

Conditions for cleavage at aspartate-proline sites are
described in Current Protocols in Molecular Biology (supp.
28; chapter 16.4) John Wiley & Sons Inc. 1994, and in
Landon, M. "Cleavage at Aspartyl-Prolyl Bonds" in Methods
in Enzymology (1977) 47: 145-149. These references suggest
that significant variability of cleavage conditions exists
for different proteins and that cleavage might occur in
some instances without first denaturing or solubilizing the
protein. However, in practice the latter circumstances are
rare and proteins to be subjected to acid cleavage at
Asp-Pro dipeptides are usually solubilized to a state where
there is no visible turbidity. The solubilized protein
will normally not pellet when centrifuged at 100,000 x g
for 1 hour. It is now shown that mild-acid conditions may
be used for cleavage of aspartate-proline sites in
Caulobacter S-layer fusion proteins without placing the
protein in a solubilized state as described above.
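
To make the outcome of cleavage concrete, the Python sketch below lists the fragments expected when every Asp-Pro bond in a one-letter amino acid sequence is broken. It is a simplified illustration, not part of the disclosed method: complete cleavage is assumed, the function name `asp_pro_fragments` is hypothetical, and the toy fusion sequence is invented for the example.

```python
def asp_pro_fragments(protein: str) -> list[str]:
    """Split a one-letter amino acid sequence at every Asp-Pro (D-P) bond."""
    fragments = []
    start = 0
    pos = protein.find("DP")
    while pos != -1:
        # The bond between D and P is broken: D ends the upstream
        # fragment and P begins the downstream fragment.
        fragments.append(protein[start:pos + 1])
        start = pos + 1
        pos = protein.find("DP", start)
    fragments.append(protein[start:])
    return fragments


# Hypothetical toy fusion: target, Asp-Pro junction, secretion signal stub.
fusion = "MKTAYIAKQR" + "DP" + "GSAVLLTT"
print(asp_pro_fragments(fusion))
```

In this toy case the upstream (target) fragment retains the aspartate at its C-terminus and the downstream S-layer fragment begins with proline, which is why the position of the dipeptide relative to the junction determines whether extraneous residues remain on the target.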

In the method of this invention, conditions are adjusted to
avoid destruction of the target protein or solubilization
of the aggregate containing the S-layer secretion signal.
Excess acid or too high a temperature may increase the
occurrence over time of random cleavages along the length
of the fusion protein, which is to be avoided since such
random cleavages may lead to undesired fragmentation of
the fusion protein or solubilization of the aggregated
S-layer portion.

Good yields of target protein with minimum random breaks in
the fusion protein may generally be achieved by using from
5-20 mM HCl (or its equivalent while employing another
acid). The respective pH of these conditions (unbuffered
acid solution) is from about 2.3 to about 1.69. Time and
temperature are preferably adjusted to achieve the desired
cleavage while minimizing random breaks. For example,
temperature may range from room temperature to about 50° C.
Time of treatment may range from about 12 to 72 hours. Time
or temperature outside of these ranges is permissible
depending upon the strength of the acid and the acceptable
yield. Generally, lower yields are obtained with less acid
strength, less time or lower temperatures.
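
The relationship between the stated HCl concentrations and pH can be checked with a short calculation. The Python sketch below assumes an unbuffered, fully dissociated monoprotic acid and treats activity as equal to concentration; the function name `ph_of_strong_acid` is hypothetical.

```python
import math


def ph_of_strong_acid(millimolar: float) -> float:
    """pH of an unbuffered, fully dissociated monoprotic acid solution."""
    # Convert mM to mol/L, then apply pH = -log10([H+]).
    return -math.log10(millimolar / 1000.0)


for conc in (5, 20):
    print(f"{conc} mM HCl -> pH {ph_of_strong_acid(conc):.2f}")
```

The results are close to the pH values quoted above for 5 mM and 20 mM HCl. A diprotic acid such as sulfuric acid would need a different treatment, so the function should not be applied to it unchanged.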

In the following examples, efficiency of cleavage in the 
order of 40-80% is achieved using conditions similar to the 
following alternatives: 

- 5 mM HGL at 50° C. for 48-72 hours 

- 20 mM HCL at 30° C. for 48-72 hours. 

Conditions in excess of the aforementioned values may be 
employed in some cases, with the possibility of random 
breaks increasing particularly with acid strength or 
temperature. In the following examples, significant random 
cleavage occurred with 50 mM HCl at 50° C. after 48 hours. 

Any acid may be employed in this invention which is 
normally used in solutions to which proteins are exposed. 
Acids which have a deleterious effect on proteins under 
dilute conditions should be avoided. For example, HCl or 
an equivalent amount of H2SO4 may be used in this invention 
but oxidizing acids such as nitric acid may not be 
suitable. 

Example 1. Cleavage of artificial silk protein sequences 
from a secretion signal containing a native 
aspartate-proline cleavage site. 

An artificial protein sequence resembling spider silk was 
constructed by synthesis of partially overlapping and 
complementing oligomers of DNA, which were then completed 
to a full duplex DNA by Taq polymerase extension, to 
create a sequence that coded for 97 amino acids. The 
resulting DNA sequence and corresponding amino acid 
sequence are shown in Appendix 2. 

The DNA sequence shown in Appendix 2 was cloned into a gene 
carrier sequence residing in a pUC8 plasmid cloning vector. 
The gene segment carrier had BamHI restriction sites at 
each end and an internal BglII site. This combination of 
restriction sites allowed the production of multimers of 
the above sequence, relying on the fact that a BamHI sticky 
end will ligate to a BglII sticky end, with the loss of 
both restriction sites. Thus one copy of the silk-like 
sequence within the gene segment carrier can be put inside 
a second copy of the same to produce a dimer. Using this 
principle, an 8X repeat was produced and fused to DNA encoding 
the S-layer secretion signal corresponding to the 
C-terminal portion of the C. crescentus S-layer protein 
from about amino acid 690 onwards (see: Appendix 1). This 
fusion protein gene was introduced into strain CB2A on a 
broad host range plasmid vector. The 8X multimer appeared 
to be unstable, resulting in recombination events that 
reduced the 8X multimer to a 3X size. The 3-fold repeat of 
the above 97 amino acid sequence, fused to the S-layer 
secretion signal, was secreted. Protein was collected and 
subjected to treatment with 5 mM HCl for 2 days at 50° C. 
The result was the liberation of about 80% of soluble 
silk-like polymer, which was readily separated by filtration 
from the S-layer protein, which remained completely 
aggregated under these conditions. Cleavage occurred at the 
native aspartate-proline dipeptide in the Caulobacter S-layer 
signal region (see: Appendix 1, amino acids numbered 
692-693). 
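
The multimerization principle used in this example can be illustrated on the top strand alone. BamHI (G^GATCC) and BglII (A^GATCT) both leave a GATC overhang, so a BamHI-cut end joined to a BglII-cut end gives a hybrid junction (AGATCC or GGATCT) that neither enzyme recognizes. The sketch below is an assumption-laden illustration: the carrier layout, the placeholder payload and the function names are hypothetical and are not the actual gene segment carrier, and only forward-orientation insertion is modelled.

```python
BAMHI = "GGATCC"  # BamHI recognition site, cut as G^GATCC
BGLII = "AGATCT"  # BglII recognition site, cut as A^GATCT

def excise_bamhi_fragment(unit: str) -> str:
    """Top strand of the fragment released by cutting the BamHI site at each end."""
    if not (unit.startswith(BAMHI) and unit.endswith(BAMHI)):
        raise ValueError("unit must carry a BamHI site at each end")
    # Cut one base into each terminal site: fragment starts GATCC... and ends ...G
    return unit[1:len(unit) - 5]

def insert_at_bglii(unit: str, fragment: str) -> str:
    """Ligate the fragment into the single BglII site of unit (forward orientation)."""
    if unit.count(BGLII) != 1:
        raise ValueError("expected exactly one BglII site")
    cut = unit.index(BGLII) + 1  # position just after the A of A^GATCT
    return unit[:cut] + fragment + unit[cut:]

# Hypothetical carrier: BamHI - payload - BglII - BamHI (payload is a placeholder).
PAYLOAD = "CAGGGCGCG"
monomer = BAMHI + PAYLOAD + BGLII + BAMHI
dimer = insert_at_bglii(monomer, excise_bamhi_fragment(monomer))

print(dimer)
print("BamHI sites:", dimer.count(BAMHI), "BglII sites:", dimer.count(BGLII))
```

In this toy model the dimer carries two copies of the payload, keeps BamHI sites only at its ends and retains a single internal BglII site, so the same step could in principle be repeated to build larger multimers.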


Example 2. Cleavage of the salmonid virus Infectious 
Pancreatic Necrosis Virus (IPNV) surface glycoprotein 
candidate vaccine sequence from an S-layer secretion signal 
containing a native aspartate-proline site. 


The surface glycoprotein of the IPNV strain is a vaccine 
candidate. For this example and Example 4, the sequence of 
the first 257 amino acids of the mature protein and the 
corresponding DNA sequence are shown in Appendix 3. 


DNA encoding a segment of the major surface glycoprotein 
gene of IPNV specifying amino acids 145-257 of the protein 
was fused to a DNA sequence specifying two putative T-cell 
activating epitopes: MVF (SEQ ID No: 1; LSEIKGVIVHRLEGV, 
derived from Measles Virus protein F) and P2 (SEQ ID No: 2; 
QYIKANSKFIGITEL, derived from tetanus toxoid protein). The 
T-cell epitopes were positioned on the C-terminal end of 
the IPNV sequence. This chimeric protein was in turn fused 
in frame with the C. crescentus S-layer gene at about the 
amino acid 690 position of the gene and introduced into 
Caulobacter on a broad host range plasmid vector. The 
resulting secreted protein was collected and treated with 
5 mM HCl for 2 days at 50° C. Cleavage occurred at the 
native aspartate-proline dipeptide described in Example 1. The 
result was the liberation of about 75% of soluble vaccine 
candidate chimeric protein from the S-layer secretion 
signal, which remained aggregated. 

Example 3. Cleavage of segments of an E. coli type I pilus 
tip subunit from an S-layer secretion signal containing a 
native aspartate-proline cleavage site. 

The FimH gene product is the tip pilus subunit of the 
E. coli strains involved with urinary tract infections. Two 
segments, T3, specifying the first 145 amino acids of the 
mature peptide, and T7, specifying the entire 258 amino 
acids of the mature peptide, were fused to the S-layer 
secretion signal at about amino acid 690 of the S-layer 
sequence. The T3 and T7 sequences are shown in Appendix 4. 

The fusion protein genes were introduced into strain CB2A 
on a broad host range plasmid vector. In both cases the 
resulting secreted protein was collected and treated with 
5 mM HCl for 2 days at 50° C. In both cases, the result was 
the liberation of about 50% of soluble vaccine candidate 
chimeric protein from the S-layer secretion signal, which 
remained aggregated. Cleavage occurred at the native 
aspartate-proline dipeptide described in Example 1. 

Example 4. Cleavage of the salmonid virus IPNV surface 
glycoprotein candidate vaccine sequence from an S-layer 
secretion signal containing an introduced aspartate-proline 
cleavage site. 

A segment of the major surface glycoprotein gene of IPNV 
specifying amino acids 1-257 of the protein shown in 
Appendix 3 was fused to a DNA sequence specifying a peptide 
containing an aspartate-proline dipeptide (SEQ ID No: 3; 
SPLGPAGDPEAS) such that the aspartate-proline dipeptide was 
positioned very near the C-terminus of the chimeric 
protein. This chimeric protein was in turn fused in frame 
with the C. crescentus S-layer gene at about the amino acid 
784 position of the gene and introduced into strain CB2A on a 
broad host range plasmid vector. The resulting secreted 
protein was collected and treated with 5 mM HCl for 2 days 
at 50° C. Cleavage occurred at the introduced 
aspartate-proline dipeptide. The result was the liberation 
of about 40% of insoluble vaccine candidate chimeric 
protein from the S-layer secretion signal, which remained 
aggregated. 
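
The position of an aspartate-proline dipeptide in a one-letter amino acid sequence, and the fragments expected if the Asp-Pro bond is cleaved, can be sketched as follows. This is an idealised illustration that assumes complete cleavage at every Asp-Pro bond and no random breaks, which the examples above show is not fully achieved in practice; the function names are illustrative only.

```python
def asp_pro_sites(seq: str) -> list[int]:
    """1-based positions of each Asp (D) that is immediately followed by Pro (P)."""
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "DP"]

def acid_cleave(seq: str) -> list[str]:
    """Fragments obtained by breaking the bond between D and P at every site."""
    fragments, start = [], 0
    for pos in asp_pro_sites(seq):
        fragments.append(seq[start:pos])  # fragment ends with the Asp residue
        start = pos                       # next fragment begins with the Pro residue
    fragments.append(seq[start:])
    return fragments

linker = "SPLGPAGDPEAS"  # SEQ ID No: 3 from Example 4
print(asp_pro_sites(linker))
print(acid_cleave(linker))
```

For the SEQ ID No: 3 peptide the single site is at residue 8, so the upstream product would end in Asp and the downstream product would begin with Pro.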

Lengthy DNA and amino acid sequences referred to in this 
specification are set out in the following appendixes. 
Appendix 1 sets out the complete nucleotide sequence of the 
C. crescentus S-layer gene (SEQ ID No: 4) with the upstream 
sequence including the -35 and -10 sites of the promoter 
region and the Shine-Dalgarno sequence. The start codon is 
at nucleotide 101 and the coding sequence runs to and 
includes nucleotide 3179. The amino acid sequence of the 
C. crescentus S-layer protein (SEQ ID No: 5) included in 
Appendix 1 is predicted from the DNA sequence. Appendix 2 
sets out the artificial spider silk DNA sequence (SEQ ID 
No: 6) used in Example 1 and the corresponding amino acid 
sequence (SEQ ID No: 7). Appendix 3 sets out the DNA 
sequence (SEQ ID No: 8) and corresponding amino acid 
sequence (SEQ ID No: 9) of the first 257 amino acids of 
IPNV as described in Examples 2 and 4. Appendix 4 sets out 
the T3 protein sequence (SEQ ID No: 10) and the T7 protein 
sequence (SEQ ID No: 11) as described in Example 3. 

All publications, patents and patent applications referred 
to herein are hereby incorporated by reference. While this 
invention has been described according to particular 
embodiments and by reference to certain examples, it will 
be apparent to those of skill in the art that variations 
and modifications of the invention as described herein fall 
within the spirit and scope of the attached claims. 
Appendix 1 

GCTATTGTCG ACGTATGACG TTTGCTCTAT AGCCATCGCT GCTCCCATGC GCGCCACTCG    60
GTCGCAGGGG GTGTGGGATT TTTTTTGGGA GACAATCCTC ATGGCCTATA CGACGGCCCA   120
GTTGGTGACT GCGTACACCA ACGCCAACCT CGGCAAGGCG CCTGACGCCG CCACCACGCT   180
GACGCTCGAC GCGTACGCGA CTCAAACCCA GACGGGCGGC CTCTCGGACG CCGCTGCGCT   240
GACCAACACC CTGAAGCTGG TCAACAGCAC GACGGCTGTT GCCATCCAGA CCTACCAGTT   300
CTTCACCGGC GTTGCCCCGT CGGCCGCTGG TCTGGACTTC CTGGTCGACT CGACCACCAA   360
CACCAACGAC CTGAACGACG CGTACTACTC GAAGTTCGCT CAGGAAAACC GCTTCATCAA   420
CTTCTCGATC AACCTGGCCA CGGGCGCCGG CGCCGGCGCG ACGGCTTTCG CCGCCGCCTA   480
CACGGGCGTT TCGTACGCCC AGACGGTCGC CACCGCCTAT GACAAGATCA TCGGCAACGC   540
CGTCGCGACC GCCGCTGGCG TCGACGTCGC GGCCGCCGTG GCTTTCCTGA GCCGCCAGGC   600
CAACATCGAC TACCTGACCG CCTTCGTGCG CGCCAACACG CCGTTCACGG CCGCTGCCGA   660
CATCGATCTG GCCGTCAAGG CCGCCCTGAT CGGCACCATC CTGAACGCCG CCACGGTGTC   720
GGGCATCGGT GGTTACGCGA CCGCCACGGC CGCGATGATC AACGACCTGT CGGACGGCGC   780
CCTGTCGACC GACAACGCGG CTGGCGTGAA CCTGTTCACC GCCTATCCGT CGTCGGGCGT   840
GTCGGGTTCG ACCCTCTCGC TGACCACCGG CACCGACACC CTGACGGGCA CCGCCAACAA   900
CGACACGTTC CTTGCGGGTG AAGTCGCCGG CGCTGCGACC CTGACCGTTG GCGACACCCT   960
GAGCGGCGGT GCTGGCACCG ACGTCCTGAA CTGGGTGCAA GCTGCTGCGG TTACGGCTCT  1020
GCCGACCGGC GTGACGATCT CGGGCATCGA AACGATGAAC GTGACGTCGG GCGCTGCGAT  1080
CACCCTGAAC ACGTCTTCGG GCGTGACGGG TCTGACCGCC CTGAACACCA ACACCAGCGG  1140
CGCGGCTCAA ACCGTCACCG CCGGCGCTGG CCAGAACCTG ACCGCCACGA CCGCCGCTCA  1200
AGCCGCGAAC AACGTCGCCG TCGACGGGCG CGCCAACGTC ACCGTCGCCT CGACGGGCGT  1260
GACCTCGGGC AGGACCACGG TCGGCGCCAA CTCGGCCGCT TCGGGCACCG TGTCGGTGAG  1320
CGTCGCGAAC TCGAGCACGA CCACCACGGG CGCTATCGCC GTGACCGGTG GTACGGCCGT  1380
GACCGTGGCT CAAACGGCCG GCAACGCCGT GAACACCACG TTGACGCAAG CCGACGTGAC  1440
CGTGACCGGT AACTCCAGCA CCACGGCCGT GACGGTCACC CAAACGGCCG CCGCCACGGC  1500
CGGCGCTACG GTCGCCGGTC GCGTCAACGG CGCTGTGACG ATCACCGACT CTGCCGCCGC  1560
CTCGGCCACG ACCGCCGGCA AGATCGCCAC GGTCACCCTG GGCAGCTTCG GCGCCGCCAC  1620
GATCGACTCG AGCGCTCTGA CGACCGTCAA CCTGTCGGGC ACGGGCACCT CGCTCGGCAT  1680
CGGCCGCGGC GCTCTGACCG CCACGCCGAC CGCCAACACC CTGACCCTGA ACGTCAATGG  1740
TCTGACGACG ACGGGCGCGA TCACGGACTC GGAAGCGGCT GCTGACGATG GTTTCACCAC  1800
CATCAACATC GCTGGTTCGA CCGCCTCTTC GACGATCGCC AGCCTGGTGG CCGCCGACGC  1860
GACGACCCTG AACATCTCGG GCGACGCTCG CGTCACGATC ACCTCGCACA CCGCTGCCGC  1920
CCTGACGGGC ATCACGGTGA CCAACAGCGT TGGTGCGACC CTCGGCGCCG AACTGGCGAC  1980
CGGTCTGGTC TTCACGGGCG GCGCTGGCCG TGACTCGATC CTGCTGGGCG CCACGACCAA  2040
GGCGATCGTC ATGGGCGCCG GCGACGACAC CGTCACCGTC AGCTCGGCGA CCCTGGGCGC  2100
TGGTGGTTCG GTCAACGGCG GCGACGGCAC CGACGTTCTG GTGGCCAACG TCAACGGTTC  2160
GTCGTTCAGC GCTGACCCGG CCTTCGGCGG CTTCGAAACC CTCCGCGTCG CTGGCGCGGC  2220
GGCTCAAGGC TCGCACAACG CCAACGGCTT CACGGCTCTG CAACTGGGCG CGACGGCGGG  2280
TGCGACGACC TTCACCAACG TTGCGGTGAA TGTCGGCCTG ACCGTTCTGG CGGCTCCGAC  2340
CGGTACGACG ACCGTGACCC TGGCCAACGC CACGGGCACC TCGGACGTGT TCAACCTGAC  2400
CCTGTCGTCC TCGGCCGCTC TGGCCGCTGG TACGGTTGCG CTGGCTGGCG TCGAGACGGT  2460
GAACATCGCC GCCACCGACA CCAACACGAC CGCTCACGTC GACACGCTGA CGCTGCAAGC  2520
CACCTCGGCC AAGTCGATCG TGGTGACGGG CAACGCCGGT CTGAACCTGA CCAACACCGG  2580
CAACACGGCT GTCACCAGCT TCGACGCCAG CGCCGTCACC GGCACGGCTC CGGCTGTGAC  2640
CTTCGTGTCG GCCAACACCA CGGTGGGTGA AGTCGTCACG ATCCGCGGCG GCGCTGGCGC  2700
CGACTCGCTG ACCGGTTCGG CCACCGCCAA TGACACCATC ATCGGTGGCG CTGGCGCTGA  2760
CACCCTGGTC TACACCGGCG GTACGGACAC CTTCACGGGT GGCACGGGCG CGGATATCTT  2820
CGATATCAAC GCTATCGGCA CCTCGACCGC TTTCGTGACG ATCACCGACG CCGCTGTCGG  2880
CGACAAGCTC GACCTCGTCG GCATCTCGAC GAACGGCGCT ATCGCTGACG GCGCCTTCGG  2940
CGCTGCGGTC ACCCTGGGCG CTGCTGCGAC CCTGGCTCAG TACCTGGACG CTGCTGCTGC  3000
CGGCGACGGC AGCGGCACCT CGGTTGCCAA GTGGTTCCAG TTCGGCGGCG ACACCTATGT  3060
CGTCGTTGAC AGCTCGGCTG GCGCGACCTT CGTCAGCGGC GCTGACGCGG TGATCAAGCT  3120
GACCGGTCTG GTCACGCTGA CCACCTCGGC CTTCGCCACC GAAGTCCTGA CGCTCGCCTA  3180
AGCGAACGTC TGATCCTCGC CTAGGCGAGG ATCGCTAGAC TAAGAGACCC CGTCTTCCGA  3240
AAGGGAGGCG GGGTCTTTCT TATGGGCGCT ACGCGCTGGC CGGCCTTGCC TAGTTCCGGT  3300

Met Ala Tyr Thr Thr Ala Gln Leu Val Thr Ala Tyr Thr Asn Ala Asn 
1               5                   10                  15

Leu Gly Lys Ala Pro Asp Ala Ala Thr Thr Leu Thr Leu Asp Ala Tyr 
            20                  25                  30

Ala Thr Gln Thr Gln Thr Gly Gly Leu Ser Asp Ala Ala Ala Leu Thr 
        35                  40                  45

Asn Thr Leu Lys Leu Val Asn Ser Thr Thr Ala Val Ala Ile Gln Thr 
    50                  55                  60

Tyr Gln Phe Phe Thr Gly Val Ala Pro Ser Ala Ala Gly Leu Asp Phe 
65                  70                  75                  80

Leu Val Asp Ser Thr Thr Asn Thr Asn Asp Leu Asn Asp Ala Tyr Tyr 
                85                  90                  95

Ser Lys Phe Ala Gln Glu Asn Arg Phe Ile Asn Phe Ser Ile Asn Leu 
            100                 105                 110

Ala Thr Gly Ala Gly Ala Gly Ala Thr Ala Phe Ala Ala Ala Tyr Thr 
        115                 120                 125

Gly Val Ser Tyr Ala Gln Thr Val Ala Thr Ala Tyr Asp Lys Ile Ile 
    130                 135                 140

Gly Asn Ala Val Ala Thr Ala Ala Gly Val Asp Val Ala Ala Ala Val 
145                 150                 155                 160

Ala Phe Leu Ser Arg Gln Ala Asn Ile Asp Tyr Leu Thr Ala Phe Val 
                165                 170                 175

Arg Ala Asn Thr Pro Phe Thr Ala Ala Ala Asp Ile Asp Leu Ala Val 
            180                 185                 190

Lys Ala Ala Leu Ile Gly Thr Ile Leu Asn Ala Ala Thr Val Ser Gly 
        195                 200                 205

Ile Gly Gly Tyr Ala Thr Ala Thr Ala Ala Met Ile Asn Asp Leu Ser 
    210                 215                 220

Asp Gly Ala Leu Ser Thr Asp Asn Ala Ala Gly Val Asn Leu Phe Thr 
225                 230                 235                 240

Ala Tyr Pro Ser Ser Gly Val Ser Gly Ser Thr Leu Ser Leu Thr Thr 
                245                 250                 255

Gly Thr Asp Thr Leu Thr Gly Thr Ala Asn Asn Asp Thr Phe Val Ala 
            260                 265                 270

Gly Glu Val Ala Gly Ala Ala Thr Leu Thr Val Gly Asp Thr Leu Ser 
        275                 280                 285

Gly Gly Ala Gly Thr Asp Val Leu Asn Trp Val Gln Ala Ala Ala Val 
    290                 295                 300

Thr Ala Leu Pro Thr Gly Val Thr Ile Ser Gly Ile Glu Thr Met Asn 
305                 310                 315                 320

Val Thr Ser Gly Ala Ala Ile Thr Leu Asn Thr Ser Ser Gly Val Thr 
                325                 330                 335

Gly Leu Thr Ala Leu Asn Thr Asn Thr Ser Gly Ala Ala Gln Thr Val 
            340                 345                 350

Thr Ala Gly Ala Gly Gln Asn Leu Thr Ala Thr Thr Ala Ala Gln Ala 
        355                 360                 365

Ala Asn Asn Val Ala Val Asp Gly Arg Ala Asn Val Thr Val Ala Ser 
    370                 375                 380

Thr Gly Val Thr Ser Gly Thr Thr Thr Val Gly Ala Asn Ser Ala Ala 
385                 390                 395                 400

Ser Gly Thr Val Ser Val Ser Val Ala Asn Ser Ser Thr Thr Thr Thr 
                405                 410                 415

Gly Ala Ile Ala Val Thr Gly Gly Thr Ala Val Thr Val Ala Gln Thr 
            420                 425                 430

Ala Gly Asn Ala Val Asn Thr Thr Leu Thr Gln Ala Asp Val Thr Val 
        435                 440                 445

Thr Gly Asn Ser Ser Thr Thr Ala Val Thr Val Thr Gln Thr Ala Ala 
    450                 455                 460

Ala Thr Ala Gly Ala Thr Val Ala Gly Arg Val Asn Gly Ala Val Thr 
465                 470                 475                 480

Ile Thr Asp Ser Ala Ala Ala Ser Ala Thr Thr Ala Gly Lys Ile Ala 
                485                 490                 495

Thr Val Thr Leu Gly Ser Phe Gly Ala Ala Thr Ile Asp Ser Ser Ala 
            500                 505                 510

Leu Thr Thr Val Asn Leu Ser Gly Thr Gly Thr Ser Leu Gly Ile Gly 
        515                 520                 525

Arg Gly Ala Leu Thr Ala Thr Pro Thr Ala Asn Thr Leu Thr Leu Asn 
    530                 535                 540

Val Asn Gly Leu Thr Thr Thr Gly Ala Ile Thr Asp Ser Glu Ala Ala 
545                 550                 555                 560

Ala Asp Asp Gly Phe Thr Thr Ile Asn Ile Ala Gly Ser Thr Ala Ser 
                565                 570                 575

Ser Thr Ile Ala Ser Leu Val Ala Ala Asp Ala Thr Thr Leu Asn Ile 
            580                 585                 590

Ser Gly Asp Ala Arg Val Thr Ile Thr Ser His Thr Ala Ala Ala Leu 
        595                 600                 605

Thr Gly Ile Thr Val Thr Asn Ser Val Gly Ala Thr Leu Gly Ala Glu 
    610                 615                 620

Leu Ala Thr Gly Leu Val Phe Thr Gly Gly Ala Gly Arg Asp Ser Ile 
625                 630                 635                 640

Leu Leu Gly Ala Thr Thr Lys Ala Ile Val Met Gly Ala Gly Asp Asp 
                645                 650                 655

Thr Val Thr Val Ser Ser Ala Thr Leu Gly Ala Gly Gly Ser Val Asn 
            660                 665                 670

Gly Gly Asp Gly Thr Asp Val Leu Val Ala Asn Val Asn Gly Ser Ser 
        675                 680                 685

Phe Ser Ala Asp Pro Ala Phe Gly Gly Phe Glu Thr Leu Arg Val Ala 
    690                 695                 700

Gly Ala Ala Ala Gln Gly Ser His Asn Ala Asn Gly Phe Thr Ala Leu 
705                 710                 715                 720

Gln Leu Gly Ala Thr Ala Gly Ala Thr Thr Phe Thr Asn Val Ala Val 
                725                 730                 735

Asn Val Gly Leu Thr Val Leu Ala Ala Pro Thr Gly Thr Thr Thr Val 
            740                 745                 750

Thr Leu Ala Asn Ala Thr Gly Thr Ser Asp Val Phe Asn Leu Thr Leu 
        755                 760                 765

Ser Ser Ser Ala Ala Leu Ala Ala Gly Thr Val Ala Leu Ala Gly Val 
    770                 775                 780

Glu Thr Val Asn Ile Ala Ala Thr Asp Thr Asn Thr Thr Ala His Val 
785                 790                 795                 800

Asp Thr Leu Thr Leu Gln Ala Thr Ser Ala Lys Ser Ile Val Val Thr 
                805                 810                 815

Gly Asn Ala Gly Leu Asn Leu Thr Asn Thr Gly Asn Thr Ala Val Thr 
            820                 825                 830

Ser Phe Asp Ala Ser Ala Val Thr Gly Thr Ala Pro Ala Val Thr Phe 
        835                 840                 845

Val Ser Ala Asn Thr Thr Val Gly Glu Val Val Thr Ile Arg Gly Gly 
    850                 855                 860

Ala Gly Ala Asp Ser Leu Thr Gly Ser Ala Thr Ala Asn Asp Thr Ile 
865                 870                 875                 880

Ile Gly Gly Ala Gly Ala Asp Thr Leu Val Tyr Thr Gly Gly Thr Asp 
                885                 890                 895

Thr Phe Thr Gly Gly Thr Gly Ala Asp Ile Phe Asp Ile Asn Ala Ile 
            900                 905                 910

Gly Thr Ser Thr Ala Phe Val Thr Ile Thr Asp Ala Ala Val Gly Asp 
        915                 920                 925

Lys Leu Asp Leu Val Gly Ile Ser Thr Asn Gly Ala Ile Ala Asp Gly 
    930                 935                 940

Ala Phe Gly Ala Ala Val Thr Leu Gly Ala Ala Ala Thr Leu Ala Gln 
945                 950                 955                 960

Tyr Leu Asp Ala Ala Ala Ala Gly Asp Gly Ser Gly Thr Ser Val Ala 
                965                 970                 975

Lys Trp Phe Gln Phe Gly Gly Asp Thr Tyr Val Val Val Asp Ser Ser 
            980                 985                 990

Ala Gly Ala Thr Phe Val Ser Gly Ala Asp Ala Val Ile Lys Leu Thr 
        995                 1000                1005

Gly Leu Val Thr Leu Thr Thr Ser Ala Phe Ala Thr Glu Val Leu Thr 
    1010                1015                1020

Leu Ala 
1025

Appendix 2 

GAA TTC AGA TCT CAG GGC GCG GGG CAG GGT GGC TAT GGT GGG CTC GGC TCG CAA GGC GCT 
EFRSQGAGQGGYGGLGSQGA 

GGC CTG GGT GGC CAG GGC GCT GGC GCG GCC GCG GCC GCT GCG GCC GGT 
GRGGQGAGAAAAAAAGG 

GCT GGC CAG GGC GGG CTG GGC TCG CAG GGC GCC GGC CAA GGC GCT GGC GCC GCG GCC GCT 
AGQGGLGSQGAGQGAGAAAA 

GCG GCC GGT GGC GCC GGC CAG GGT GGC TAC GGC GGC CTG GGC AGC CAG GGC GCC GGT 
AAGGAGQGGYGGLGSQGAGR 

GGC GGT CAG GGC GCC GGT GCC GCG GCC GCT GCG GCC GGT GGC GCT GGG CAA GGC GGC TAC 
GGQGAGAAAAAAGGAGQGGY 

GGC GGT CTG GGA TCC 
GGLGS 
Appendix 3 

1/1 
atg aac aca aac aag gca acc gca act tac ttg aaa tcc att atg ctt cca gag act 
met asn thr asn lys ala thr ala thr tyr leu lys ser ile met leu pro glu thr 

61/21 
CCS gca agc atc ccg gac gac ata acg gag aga cac atc tta aaa caa gag acc tcg tca 
pro ala ser ile pro asp asp ile thr glu arg his ile leu lys gln glu thr ser ser 

121/41 
tac aac tta gag gtc tcc gaa tca gga agt ggc att ctt gtt tgt ttc cct ggg gca cca 
tyr asn leu glu val ser glu ser gly ser gly ile leu val cys phe pro gly ala pro 

181/61 
ggc tca cgg atc ggt gca cac tac aga tgg aat gcg aac cag acg ggg ctg gag ttc 
gly ser arg ile gly ala his tyr arg trp asn ala asn gln thr gly leu glu phe asp 

241/81 
cag tgg ctg gag acg tcg cag gac ctg aag aaa gcc ttc aac tac ggg agg ctg atc tca 
gln trp leu glu thr ser gln asp leu lys lys ala phe asn tyr gly arg leu ile ser 

301/101 
agg aaa tac gac att caa agc tcc aca cta ccg gcc ggt ctc tat gct ctg aac ggg 
arg lys tyr asp ile gln ser ser thr leu pro ala gly leu tyr ala leu asn gly thr 

361/121 
ctc aac gct gcc acc ttc gaa ggc agt ctg tct gag gtg gag agc ctg acc tac aat agc 
leu asn ala ala thr phe glu gly ser leu ser glu val glu ser leu thr tyr asn ser 

421/141 
ctg atg tcc cta act acg aac ccc cag gac aaa gcc aac aac cag ctg gtg acc aaa gga 
leu met ser leu thr thr asn pro gln asp lys ala asn asn gln leu val thr lys gly 

481/161 
gtc acc gtc ctg aat cta cca aca ggg ttc gac aaa cca tac gtc cgc cta gag gac gag 
val thr val leu asn leu pro thr gly phe asp lys pro tyr val arg leu glu asp glu 

541/181 
aca ccc cag ggt ctc cag tca atg aac ggg gcc agg atg agg tgc aca gct gca att gca 
thr pro gln gly leu gln ser met asn gly ala arg met arg cys thr ala ala ile ala 

601/201 
cca cgg agg tac gag atc gac ctc cca tcc caa agc cta ccc ccc gtt cct gcg aca gga 
pro arg arg tyr glu ile asp leu pro ser gln ser leu pro pro val pro ala thr gly 

661/221 
acc ctc acc act ctc tac gag gga aac gcc gac atc gtc agc tcc aca aca gtg acg gga 
thr leu thr thr leu tyr glu gly asn ala asp ile val ser ser thr thr val thr gly 

721/241 
gac ata aac ttc agt ctg gca gaa cga ccc gca aac gag acc agg ttc gac ttc cag 
asp ile asn phe ser leu ala glu arg pro ala asn glu thr arg phe asp phe gln leu 
WHAT IS CLAIMED IS: 

1. A method of cleaving a fusion protein consisting of a 
first component comprising all or part of a 
Caulobacter S-layer protein including a Caulobacter 
C-terminal secretion signal, and a second component 
comprising a heterologous polypeptide expressed and 
secreted from Caulobacter, wherein the fusion protein 
comprises at least one aspartate-proline dipeptide, 
and wherein the method comprises the step of combining 
said fusion protein with an acid solution of a 
strength insufficient to solubilize the fusion protein 
for a time sufficient for cleavage of the fusion 
protein at said aspartate-proline dipeptide. 

2. The method of claim 1 wherein the at least one 
dipeptide is situated between the first and second 
components or adjacent a junction between the first 
and second components. 

3. The method of claim 1 wherein the acid solution has a 
pH of from about 1.5 to about 2.5. 

4. The method of claim 1 wherein the acid solution has a 
pH of about 1.65 to about 2.35. 

5. The method of any one of claims 1-4 wherein the method 
is carried out at a temperature in the range of about 
30° C. to about 50° C. 

6. The method of any one of claims 1-5 comprising the 
additional step of separating products cleaved from 
the fusion protein. 

7. A method of preparing a DNA construct for expression 
of a fusion protein suitable for use in the method of 
claim 1, comprising joining an upstream DNA segment 
comprising DNA encoding a heterologous protein of 
interest and a downstream DNA segment for a 
Caulobacter C-terminal secretion signal which does not 
encode an aspartate-proline dipeptide, wherein the 
upstream segment comprises a sequence encoding an 
aspartate-proline dipeptide at or near an end of said 
upstream segment to be joined to said downstream 
segment. 

8. A method of preparing a fusion protein, comprising the 
steps of expressing a DNA construct prepared as 
described in claim 7 in Caulobacter and recovering 
said fusion protein once secreted by the Caulobacter. 



